Abstract

In this work, the priority-first sequential-search decoding algorithm proposed in [8] is revisited. By replacing the conventional Fano metric with one derived from the Wagner rule, the sequential-search decoding in [8] guarantees maximum-likelihood (ML) performance, and hence was named the maximum-likelihood sequential decoding algorithm (MLSDA). It was then concluded by simulations that the software computational complexity of the MLSDA is in general considerably smaller than that of the Viterbi algorithm.

A common problem of sequential-type decoding is that, at signal-to-noise ratios (SNRs) below the one corresponding to the cutoff rate, the average decoding complexity per information bit and the required stack size grow rapidly with the information length [13]. This problem to some extent prevents the practical use of sequential-type decoding for convolutional codes with long information sequences at low SNRs. In order to alleviate the problem in the MLSDA, we propose to directly eliminate the top path whose end node is $\Delta$ trellis levels prior to the farthest one among all nodes that have been expanded thus far by the sequential search, which we term early elimination. Following a random coding argument, we analyze the early-elimination window $\Delta$ that results in negligible performance degradation for the MLSDA. Our analytical results indicate that the required early-elimination window for negligible performance degradation is just twice the constraint length for rate one-half convolutional codes. For rate one-third convolutional codes, the required early-elimination window even reduces to the constraint length. The suggested theoretical level thresholds almost coincide with the simulation results. As a consequence of the small early-elimination window required for near maximum-likelihood performance, the MLSDA with early-elimination modification removes a considerable computational burden, as well as memory requirement, by directly eliminating a large number of the top paths, which makes the MLSDA with early elimination well suited to applications that dictate a low-complexity software implementation with near maximum-likelihood performance.

Index Terms 

Sequential decoding, maximum-likelihood, soft-decision, random coding 

I. Introduction 

One of the most commonly used decoding algorithms for convolutional codes is the Viterbi algorithm. It operates on a convolutional code trellis, and has been shown to be a maximum-likelihood decoder [13]. Since its decoding complexity grows exponentially with the code constraint length, the Viterbi algorithm is usually applied only to convolutional codes with short constraint lengths.

When the information sequence is long, path truncation was suggested for a practical implementation of the Viterbi decoder [13]. Instead of keeping all trellis branches on the survivor paths in the decoder memory, only a certain number of the most recent trellis branches are retained, and a decision is forced on the oldest trellis branch whenever new data arrive at the decoder. In the literature, three strategies have been proposed for the forced decision: (1) the majority vote strategy, which traces back from all states and outputs the decision that occurs most often; (2) the best state strategy, which only traces back from the state with the best metric and outputs the information bits corresponding to the path being traced; (3) the random state strategy, which traces back from one randomly chosen state and outputs the information bits corresponding to the path being traced. Although none of the three strategies guarantees maximum-likelihood performance, their performance degradation can be made negligible as long as the traceback window, or truncation window, is sufficiently large.

In [3], Forney proved that a truncation window of 5.8 times the code constraint length suffices to provide negligible performance degradation for the best state strategy. Hemmati and Costello [11] later derived an upper performance bound as a function of the truncation window, and obtained a similar conclusion for the best state strategy. McEliece and Onyszchuk [14] studied the tradeoff between the length of the truncation window and the performance loss for the random state strategy, and concluded that the truncation window for the random state strategy should be about twice as large as that for the best state strategy.

On the other hand, the sequential decoding algorithm has received little attention in the past 30 years due to its sub-optimum performance and the lack of an efficient and cost-effective hardware implementation. However, because its decoding complexity is independent of the code constraint length, the sequential decoding algorithm is suitable for convolutional codes with large memory order. For this reason, it has recently been proposed for decoding the so-called "super-code" that considers the joint effect of the multi-path channel and the convolutional code [10].

Based on the Wagner rule, a variant of the sequential decoding algorithm has been established, and was proved to be maximum-likelihood [8]. As a result, the new sequential-type decoding algorithm is termed the maximum-likelihood sequential decoding algorithm (MLSDA) for ease of reference. By simulations, the authors in [8] observed that, from a pure software implementation standpoint, the average decoding complexity of the MLSDA is in general considerably smaller than that of the Viterbi algorithm when the signal-to-noise ratio (SNR) of the additive white Gaussian noise (AWGN) channel is larger than 2 dB.

Similar to the Viterbi algorithm, the decoding burdens of the sequential decoding algorithm, both in memory consumption and in computational complexity, grow as the length of the information sequence increases. However, in order to compensate for the SNR loss due to the additional zeros at the end of the information sequence, a long information sequence is often preferred in practice. One solution to reduce the decoding burdens that result from a long information sequence is to introduce the path truncation concept of the Viterbi algorithm to the sequential decoding algorithm. As one example, Zigangirov [19] derived an error probability upper bound for sequential decoding with a backsearch limit, in which the decoder traces back the top path in the stack to output the forced decisions of the symbols at those levels prior to the backsearch limit. Under the situation that the channel critical rate is smaller than $(\kappa - 1)/\kappa$ of the computational cutoff rate, where $\kappa$ is the ratio of the backsearch limit to the code constraint length, Zigangirov's bound was shown to reduce to the Yudkin-Viterbi bound [5] for infinite backsearch limit at low to medium rates, and to coincide with the random coding bound at high rates [19].

In this paper, an alternative approach to lowering the decoding complexity of sequential decoding is examined. Instead of tracing back the top path in the stack to force the decision of the symbols beyond the backsearch limit, we propose to directly eliminate the top path whose end node is $\Delta$ levels prior to the farthest node among all nodes that have been expanded thus far by the sequential search, which is named early elimination. Following a random coding argument similar to that used by Forney [3], we analyze the early-elimination window $\Delta$ that results in negligible performance degradation for the MLSDA. Our analytical results indicate that the required early-elimination window for negligible performance degradation is just twice the constraint length for rate one-half convolutional codes. For rate one-third convolutional codes, the required early-elimination window even reduces to the constraint length. Simulations are also performed, and they confirm and match the analytical results.

As a consequence of the small early-elimination window required for near maximum-likelihood performance, the MLSDA with early-elimination modification removes a considerable computational burden, as well as memory requirement, by directly eliminating a large number of the top paths. Furthermore, it can also be implemented together with the backsearch scheme to provide timely decisions with fixed delay, and to further reduce the decoding complexity. This suggests the potential and suitability of the MLSDA with early elimination for applications that dictate a low-complexity software implementation with near maximum-likelihood performance.

The rest of the paper is organized as follows. The preliminary results are briefly reviewed in Section II. The early-elimination modification of the MLSDA is presented in Section III. The analysis of the sufficient early-elimination window for near-maximum-likelihood performance is given in Section IV. Numerical and simulation results are discussed in Section V. Section VI concludes this paper.

Throughout the paper, the natural logarithm is assumed unless otherwise stated.

II. Preliminaries 

In this section, we present the system model considered in this work. For completeness, we also briefly review the technique that Forney used to prove that a truncation window of 5.8 times the code constraint length is sufficient to secure near-optimal performance for the best state strategy.

A. System Model and the MLSDA 

Let $\mathcal{C}$ be a binary $(n, k, m)$ convolutional code with a finite input information sequence of $k \times L$ bits, followed by $k \times m$ zeros to clear the encoder memory. Thus, $\mathcal{C}$ forms an $(N, K)$ linear block code with effective code rate $R = K/N$, where $K = kL$ and $N = n(L + m)$. Denote the parity-check matrix of $\mathcal{C}$ by $H$. The code rate, the memory order and the constraint length of $\mathcal{C}$ are given by $k/n$, $m$ and $(m + 1)$, respectively. Denote a binary codeword of $\mathcal{C}$ by $\mathbf{v} = (v_0, v_1, \ldots, v_{N-1})$, where each $v_j \in \{0, 1\}$. For notational convenience, we represent a portion of codeword $\mathbf{v}$ by $\mathbf{v}_{(a,b)} = (v_a, v_{a+1}, \ldots, v_b)$, and abbreviate $\mathbf{v}_{(0,b)}$ and $\mathbf{v}_{(0,N-1)}$ as $\mathbf{v}_{(b)}$ and $\mathbf{v}$, respectively. The same abbreviation will be applied to other vector notations.

Assume that the binary codeword is transmitted over a binary-input time-discrete channel with channel output $\mathbf{r} = (r_0, r_1, \ldots, r_{N-1})$. Define the hard-decision sequence $\mathbf{y} = (y_0, y_1, \ldots, y_{N-1})$ corresponding to $\mathbf{r}$ as:

\[ y_j \triangleq \begin{cases} 1, & \text{if } \phi_j < 0; \\ 0, & \text{otherwise,} \end{cases} \]

where $\phi_j = \log[\Pr(r_j | v_j = 0)/\Pr(r_j | v_j = 1)]$, and $\Pr(r_j | v_j)$ denotes the channel transition probability of $r_j$ given $v_j$. According to the Wagner rule, the maximum-likelihood decoding output $\hat{\mathbf{v}}$ for received vector $\mathbf{r}$ can be obtained by

\[ \hat{\mathbf{v}} = \mathbf{y} \oplus \mathbf{e}^*, \tag{1} \]

where "$\oplus$" is the bit-wise exclusive-or operation, and $\mathbf{e}^*$ is the one with the smallest $\sum_{j=0}^{N-1} e_j |\phi_j|$ among all error patterns $\mathbf{e} \in \{0,1\}^N$ satisfying $\mathbf{e} H^T = \mathbf{y} H^T$. Here, superscript "$T$" denotes the matrix transpose operation. Based on the observation in (1), a new sequential-type decoder can be established by replacing the Fano metric in the conventional sequential decoding algorithm by a metric defined as:

\[ \mu\big(\mathbf{x}_{(\ell n-1)}\big) \triangleq \sum_{j=0}^{\ell n-1} \mu(x_j), \tag{2} \]

where $\mathbf{x}_{(\ell n-1)} = (x_0, x_1, \ldots, x_{\ell n-1}) \in \{0,1\}^{\ell n}$ represents the label of a path ending at level $\ell$ in the $(n, k, m)$ convolutional code tree, and $\mu(x_j) \triangleq (y_j \oplus x_j)|\phi_j|$. Since the new decoding metric is nondecreasing along the code path, and since finding $\mathbf{e}^*$ is equivalent to finding the code path with the smallest metric in the code tree, it was proved in [8] that the new sequential-type decoder can always locate the maximum-likelihood codeword through the greedy-in-nature priority-first sequential codeword search. For this reason, the new sequential-type decoder is named the maximum-likelihood sequential decoding algorithm (MLSDA) [8].
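The Wagner-rule metric can be illustrated with a minimal Python sketch. The function names and the sample log-likelihood ratios below are illustrative assumptions, not part of [8]; the code only evaluates the hard decisions $y_j$, the reliabilities $|\phi_j|$, and the path metric of (2) for a given label prefix.

```python
def hard_decisions_and_reliabilities(llrs):
    """Return (y, |phi|) for log-likelihood ratios phi_j = log[Pr(r_j|0)/Pr(r_j|1)]."""
    y = [1 if phi < 0 else 0 for phi in llrs]
    rel = [abs(phi) for phi in llrs]
    return y, rel


def path_metric(label, y, rel):
    """Metric (2): sum of (y_j XOR x_j) * |phi_j| over the bits of the label prefix."""
    return sum((yj ^ xj) * w for xj, yj, w in zip(label, y, rel))


# Hypothetical channel observations, already converted to log-likelihood ratios.
llrs = [2.0, -0.5, 1.5, -3.0]
y, rel = hard_decisions_and_reliabilities(llrs)

# A label that agrees with the hard decisions costs nothing; every disagreement
# adds the reliability of the disagreeing position, so the metric never decreases
# as the label is extended.
prefix_metrics = [path_metric([0, 0, 0, 1][:t], y, rel) for t in range(5)]
```

Because each term is nonnegative, the metric of any prefix is a lower bound on the metric of all of its extensions, which is the property the priority-first search relies on.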

By adding another stack, the MLSDA can be made to operate on a code trellis instead of a code tree [8]. The two stacks used in the trellis-based MLSDA are referred to as the Open Stack and the Closed Stack. The Open Stack contains all paths that end at the frontier part of the trellis explored thus far (cf. Fig. 2). The Open Stack functions similarly to the single stack in the conventional sequential decoding algorithm. The Closed Stack stores the ending states and ending levels of the paths that have been the top paths of the Open Stack. The Closed Stack is used to determine whether two paths intersect in the code trellis during the sequential search. The trellis-based MLSDA [8] is quoted below for completeness.

< Trellis-Based MLSDA > 

Step 1. Load the Open Stack with the origin node whose metric is zero.

Step 2. Put into the Closed Stack both the state and level of the end node of the top path in the Open Stack. Compute the path metric for each of the successor paths of the top path in the Open Stack by adding the branch metric of the extended branch to the path metric of the top path. Delete the top path from the Open Stack.

Step 3. Discard the successor paths in Step 2 which end at a node that has the same state and level as any entry in the Closed Stack. If any successor path ends at the same node as a path already in the Open Stack, eliminate the path with higher path metric.¹

Step 4. Insert the remaining successor paths into the Open Stack in order of ascending path metrics. If two paths in the Open Stack have equal metric, sort them in order of descending levels. If, in addition, they happen to end at the same level, sort them randomly.

Step 5. If the top path in the Open Stack reaches the end of the convolutional code trellis, the algorithm stops; otherwise go to Step 2.
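The five steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the implementation of [8]: the rate-1/2 code with generators (7, 5) in octal is a hypothetical example, the Open Stack is a binary heap from the standard `heapq` module, ties at equal metric and level are broken by insertion order instead of randomly, and the merging in Step 3 is done lazily (a worse path ending at an already closed node is dropped when it reaches the top of the heap).

```python
import heapq
import itertools

# Hypothetical example code: rate 1/2, memory order m = 2, generators (7, 5) octal.
G = (0b111, 0b101)
M = 2


def branch(state, u):
    """Return (next_state, output_bits) when input bit u is fed in `state`."""
    reg = (u << M) | state                       # newest bit sits in the top position
    out = [bin(reg & g).count("1") & 1 for g in G]
    return reg >> 1, out


def encode(info):
    """Encode the info bits followed by M zeros that clear the encoder memory."""
    state, code = 0, []
    for u in list(info) + [0] * M:
        state, out = branch(state, u)
        code.extend(out)
    return code


def mlsda(y, rel, L):
    """Priority-first trellis search with the metric sum (y_j XOR x_j) * |phi_j|."""
    n = len(G)
    order = itertools.count()                    # deterministic stand-in for random ties
    # Open Stack entry: (metric, -level, tie, state, input bits so far).
    open_stack = [(0.0, 0, next(order), 0, ())]  # Step 1
    closed = set()                               # Closed Stack of (level, state)
    while open_stack:
        metric, neg_level, _, state, bits = heapq.heappop(open_stack)
        level = -neg_level
        if (level, state) in closed:             # lazy merge: a better path got here first
            continue
        if level == L + M:                       # Step 5: top path reached the trellis end
            return list(bits[:L]), metric
        closed.add((level, state))               # Step 2
        for u in ((0, 1) if level < L else (0,)):
            nxt, out = branch(state, u)
            if (level + 1, nxt) in closed:       # Step 3
                continue
            lo = level * n
            bm = sum((a ^ b) * w for a, b, w in zip(out, y[lo:lo + n], rel[lo:lo + n]))
            heapq.heappush(                      # Step 4: ascending metric, descending level
                open_stack, (metric + bm, -(level + 1), next(order), nxt, bits + (u,)))
    return None, float("inf")


info = [1, 0, 1, 1, 0]
received = encode(info)
received[3] ^= 1                                 # one hard-decision error
decoded, metric = mlsda(received, [1.0] * len(received), len(info))
```

With unit reliabilities the metric reduces to the Hamming distance, so the sketch behaves like a hard-decision maximum-likelihood decoder; soft decisions only require passing the actual $|\phi_j|$ values in `rel`.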

It is known that the computational effort of sequential-search decoding algorithms, including the trellis-based MLSDA, is determined not only by the number of metrics computed, but also by the cost of searching and inserting the stack elements. However, the latter cost can be made of comparable order to the former by adopting the double-ended heap (DEAP) [2] data structure in the stack implementation. This justifies the common usage of the number of metric computations as the key determinant of the algorithmic complexity of the sequential-search decoding algorithm.

1 For discrete channels, it may occur that the successor path not only ends at the same node as some path already in the Open Stack but also has equal path metric to it. In such a case, randomly eliminate one of them.

B. Random Coding Analysis of the Path Truncation Window 

In [4], Gallager considered the discrete memoryless channel with input alphabet size $I$, output alphabet size $J$ and channel transition probability $P_{ji}$, and presented the random coding bound for the maximum-likelihood decoding error $P_e$ of the $(N, K)$ block code as:

\[ P_e \le \exp\{-N[-\rho R + E_0(\rho, \mathbf{p})]\} \]

for all $0 \le \rho \le 1$, where $R = \log(I^K)/N = (K/N)\log(I)$ is the code rate measured in nats per symbol, $\mathbf{p} = (p_1, p_2, \ldots, p_I)$ is the input distribution adopted for the random selection of codewords, and

\[ E_0(\rho, \mathbf{p}) \triangleq -\log \sum_{j=1}^{J} \left( \sum_{i=1}^{I} p_i P_{ji}^{1/(1+\rho)} \right)^{1+\rho}. \tag{3} \]

Gallager's result leads to the well-known random coding exponent:

\[ E_r(R) \triangleq \max_{0 \le \rho \le 1} \max_{\mathbf{p}} [-\rho R + E_0(\rho, \mathbf{p})] = \max_{0 \le \rho \le 1} [-\rho R + E_0(\rho)], \]

where $E_0(\rho) \triangleq \max_{\mathbf{p}} E_0(\rho, \mathbf{p})$ is the Gallager function [20]. Notably, the random coding exponent is a lower bound of the channel reliability function $E(R) \triangleq \lim_{N \to \infty} -(1/N)\log(P_e)$ (provided the limit exists), and is tight for code rates above the cutoff rate.
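As a small numerical illustration, $E_0(\rho, \mathbf{p})$ in (3) can be evaluated directly from a transition matrix. The sketch below is an assumption-laden example, not taken from [4]: the function name, the matrix layout `P[j][i]` (output $j$ given input $i$) and the binary symmetric test channel are all illustrative choices.

```python
import math


def gallager_e0(rho, p, P):
    """E_0(rho, p) of (3) in nats; P[j][i] = Pr(output j | input i)."""
    total = 0.0
    for row in P:                                # one row per output symbol j
        inner = sum(pi * pji ** (1.0 / (1.0 + rho)) for pi, pji in zip(p, row))
        total += inner ** (1.0 + rho)
    return -math.log(total)


# Illustrative channel: binary symmetric with crossover probability 0.1,
# driven by the uniform input distribution.
eps = 0.1
P = [[1.0 - eps, eps],
     [eps, 1.0 - eps]]
p = [0.5, 0.5]

e0_at_1 = gallager_e0(1.0, p, P)                 # equals the cutoff rate for this input
```

For this symmetric channel the uniform input maximizes $E_0(\rho, \mathbf{p})$, so the value at $\rho = 1$ coincides with $\log 2 - 2\log(\sqrt{\varepsilon} + \sqrt{1-\varepsilon})$.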

In [17], Viterbi applied a similar random coding argument to the derivation of the decoding error for time-varying convolutional codes. Specifically, he considered a single-input $n$-output convolutional encoder with one $(m + 1)$-stage shift register as shown in Fig. 1. The $n$ inner product computers may change with each new input symbol, and hence, a time-varying code trellis results. As all elements are assumed to be in GF($q$), each input symbol induces $q$ branches on the code trellis, and each branch is labelled by $n$ channel symbols. As a result of the $m$ zeros attached at the end, the encoder produces $n(L + m)$ output channel symbols in response to an input sequence of $L$ symbols. Under the above system setting, Viterbi showed that the maximum-likelihood decoding error $P_{e,c}$ for time-varying convolutional codes can be upper-bounded by:

\[ P_{e,c} \le \frac{q-1}{1 - q^{-\lambda/R}} \exp[-n(m+1)E_0(\rho)] \tag{4} \]

for all $0 \le \rho \le 1$, where $R = \log(q)/n$ is the code rate in nats per symbol, and $\lambda = E_0(\rho) - \rho R$ is a constant independent of $n(m + 1)$. Since $\lambda$ is required to be positive, it can be concluded that:

\[ \liminf_{n \to \infty} -\frac{1}{n} \log P_{e,c} \ge (m+1)E_c(R), \]

where $E_c(R) \triangleq \max_{\{\rho \in [0,1] \,:\, E_0(\rho) > \rho R\}} E_0(\rho)$. For symmetric channels, $E_0(\rho)$ is an increasing and concave function in $\rho$ with $E_0(0) = 0$; therefore, $E_c(R)$ can be reduced to:

\[ E_c(R) = \begin{cases} R_0, & \text{if } 0 \le R < R_0; \\ E_0(\rho^*), & \text{if } R_0 \le R < C; \\ 0, & \text{if } R \ge C, \end{cases} \tag{5} \]

where $R_0 = E_0(1)$ is the cutoff rate, $C = E_0'(0)$ is the channel capacity, and $\rho^* = \rho^*(R)$ is the unique solution of $E_0(\rho) = \rho R$. It is also shown in the same work that $E_c(R)$ is a tight exponent for $R \ge R_0$.

In order to derive the path truncation window with near-optimal performance, Forney [3] treated the truncated convolutional code as a block code, and upper-bounded the additional decoding error $P_{e,T}$ due to path truncation in the Viterbi decoder by means of Gallager's technique as:

\[ P_{e,T} \le \exp[-n\tau E_r(R)], \tag{6} \]

where $\tau$ is the truncation window size. Forney then noticed that as long as

\[ \liminf_{n \to \infty} -\frac{1}{n} \log P_{e,T} > \limsup_{n \to \infty} -\frac{1}{n} \log P_{e,c}, \tag{7} \]

the additional error $P_{e,T}$ due to path truncation becomes exponentially negligible with respect to $P_{e,c}$. For $R \ge R_0$, condition (7) reduces to

\[ \tau E_r(R) > (m+1)E_c(R) \]

by inequality (6) and the tightness of $E_c(R)$. A specific case is given in Fig. 3, in which the binary symmetric channel (BSC) with crossover probability 0.4 gives that the path truncation window at the cutoff rate $R_0 = 0.0146$ bit/symbol must be larger than $E_c(R_0)/E_r(R_0) \approx 0.0146/0.0025 = 5.84$ times the code constraint length. This number parallels the one obtained under the very noisy channels, where 5.8 times the code constraint length is suggested for the path truncation window at the cutoff rate [18].
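The figure of roughly 5.8 can be reproduced numerically. The following sketch assumes the closed-form Gallager function of the BSC with uniform inputs, $E_0(\rho) = \rho\log 2 - (1+\rho)\log[\varepsilon^{1/(1+\rho)} + (1-\varepsilon)^{1/(1+\rho)}]$, and approximates the maximization in $E_r(R)$ by a grid search over $\rho$; the function names and the grid resolution are illustrative assumptions.

```python
import math


def e0_bsc(rho, eps):
    """Gallager function E_0(rho) of a BSC with crossover eps, in nats."""
    a = 1.0 / (1.0 + rho)
    return rho * math.log(2.0) - (1.0 + rho) * math.log(eps ** a + (1.0 - eps) ** a)


def random_coding_exponent(rate, eps, steps=10000):
    """E_r(R) = max over rho in [0, 1] of E_0(rho) - rho * R, by grid search."""
    return max(e0_bsc(i / steps, eps) - (i / steps) * rate for i in range(steps + 1))


eps = 0.4
r0 = e0_bsc(1.0, eps)                  # cutoff rate R_0 = E_0(1); also E_c(R_0) by (5)
er = random_coding_exponent(r0, eps)   # E_r(R_0)

r0_bits = r0 / math.log(2.0)           # same quantity in bits per symbol
window_ratio = r0 / er                 # truncation window over constraint length
```

Because the ratio of two exponents does not depend on the logarithm base, `window_ratio` can be compared directly with the value quoted in bits per symbol in the text.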

III. Early Elimination Scheme for Priority-First Sequential-Search Algorithm

The early-elimination modification that we propose in this work is based on the following observation. As shown in Fig. 2, suppose that the path ending at node C is a portion of the final code path to be located at the end of the sequential search, and suppose that the path ending at node D happens to be the current top path. Then, expanding node D until all of its offspring finally have decoding metrics exceeding those of the successors of the path ending at node C may consume a considerable but unnecessary amount of computational effort. This observation hints that by setting a proper level threshold $\Delta$ and directly eliminating the top path whose level is no larger than $(\ell_{\max} - \Delta)$, where $\ell_{\max}$ is the largest level among all nodes that have been expanded thus far by the sequential search, the computational complexity of the priority-first sequential-search algorithm may be reduced without sacrificing much of the performance.

It is worth mentioning that if the decoding metric is monotonically nondecreasing along the path portion to be searched, it can be guaranteed that the path that updates the current $\ell_{\max}$ is always the one with the smallest path metric among all paths ending at the same level [8]. This is the key to the result that the sequential search using a monotonic maximum-likelihood metric like (2) can assure that the first top path that reaches the last level of the code tree or code trellis is exactly the maximum-likelihood code path.

Based on the above observation, we propose to set a level threshold $\Delta$ in the trellis-based MLSDA, and directly eliminate the top path whose level is no larger than $(\ell_{\max} - \Delta)$. For this modification, we only need to modify Step 2 in the trellis-based MLSDA as follows.

< Trellis-Based MLSDA with Early Elimination Modification >

Initialization. Set a level threshold $\Delta$. Assign $\ell_{\max} = 0$.

Step 2'. Perform the following check before executing the original Step 2 in the trellis-based MLSDA.

• If the top path in the Open Stack ends at a node whose level is no larger than $(\ell_{\max} - \Delta)$, directly eliminate the top path, and go to Step 5; otherwise, update $\ell_{\max}$ if it is smaller than the ending level of the current top path.
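The check in Step 2' is small enough to isolate in a few lines of Python. The sketch below is illustrative only: `step2_prime` and `replay` are hypothetical helper names, and the sequence of top-path levels is made up to show when a path is dropped.

```python
def step2_prime(top_level, l_max, delta):
    """Return (eliminate, updated_l_max) for the current top path of the Open Stack."""
    if top_level <= l_max - delta:
        return True, l_max                       # eliminate the path; l_max is unchanged
    return False, max(l_max, top_level)          # keep it; it may advance l_max


def replay(levels, delta):
    """Apply Step 2' to a sequence of top-path ending levels, starting from l_max = 0."""
    l_max, flags = 0, []
    for level in levels:
        eliminate, l_max = step2_prime(level, l_max, delta)
        flags.append(eliminate)
    return flags, l_max


# Made-up sequence of ending levels of successive top paths, with threshold 3.
flags, final_l_max = replay([1, 2, 3, 1, 4, 5, 2, 6], delta=3)
```

In this toy sequence only the seventh top path is eliminated: it ends at level 2 when the search has already expanded a node at level 5, so it lags the frontier by the full threshold.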

IV. Analysis of the Early-Elimination Window with Negligible Performance Degradation

This section provides a detailed derivation of the early-elimination window that yields negligible performance degradation.

Referring to Fig. 2, suppose that the path that ends at node B is the current top path of the Open Stack. Let the current $\ell_{\max}$ be updated due to the expansion of node C. According to the merging operation performed at Step 3 of the trellis-based MLSDA, all the paths surviving in the Open Stack should follow distinct traces except possibly for the first few branches.² Hence, we may assume that the path that ends at node B and the path that updates the current $\ell_{\max}$ have common traces before node A, whose level is denoted by $\ell_{\min}$ for convenience. Let $\mathbf{x}_{(\ell_{\min} n,\, \ell n-1)}$ and $\tilde{\mathbf{x}}_{(\ell_{\min} n,\, \ell_{\max} n-1)}$ be the labels corresponding to path portions AB and AC, respectively. With the above setting, we are interested in the additional decoding error probability per node (i.e., node A) due to early elimination, as in [20]. Without loss of generality, set $\ell_{\min} = 0$ in the derivation below.

2 One example of the claimed statement is that the path ending at node B and the path ending at node D have distinct traces after node G, and share common branches all the way before node G. This property applies to all the paths that end at the unfilled-circle nodes in Fig. 2.

Note that edges EG and HF were drawn in dotted lines in Fig. 2 because after merging, no paths in the Open Stack pass through them.

Observe that the current top path ending at node B is early-eliminated if, and only if, node C is expanded earlier than node B, provided $\ell \le (\ell_{\max} - \Delta)$. Since the decoding metric of the MLSDA is monotonically nondecreasing along the path portion to be searched, that node C is expanded prior to node B is equivalent to the condition that $\mu\big(\mathbf{x}_{(\ell n-1)}\big) \ge \mu\big(\tilde{\mathbf{x}}_{(\ell_{\max} n-1)}\big)$, which, according to the maximum-likelihood nature of the metric, is in turn equivalent to:

\[ \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big) \le \Pr\big(\mathbf{r}_{(\ell_{\max} n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\ell_{\max} n-1)}\big). \tag{8} \]

By noting that for the MLSDA, the path that updates the current $\ell_{\max}$ is exactly the one with the smallest path metric among all paths ending at the same level [8], condition (8) can be equivalently re-written as:

\[ \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big) \le \max_{\tilde{\mathbf{x}}_{(\ell_{\max} n-1)} \in \hat{\mathcal{C}}_{\ell_{\max}}} \Pr\big(\mathbf{r}_{(\ell_{\max} n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\ell_{\max} n-1)}\big), \tag{9} \]

where $\hat{\mathcal{C}}_{\ell_{\max}}$ is the set of all labels of length $\ell_{\max} n$ whose corresponding paths consist of different branches from path AB after node A. Consequently, additional decoding error may be introduced by early elimination if (9) is valid for some $\ell$ and $\ell_{\max}$ with $\ell \le (\ell_{\max} - \Delta)$, when $\mathbf{x}$ is the transmitted codeword.³

3 Since the early elimination of the path with label $\mathbf{x}$ is always performed whenever (9) is valid, it is clear that additional error is introduced only when the transmitted label $\mathbf{x}$ corresponds to the maximum-likelihood code path. In other words, when $\mathbf{x}$ does not label the maximum-likelihood code path, the validity of (9), or the early elimination of the path with label $\mathbf{x}$, will not add a new error to maximum-likelihood decoding. As what concerns us is an upper probability bound for the additional error due to early elimination, it suffices to analyze the probability bound on the occurrence of (9).

Notably, when equality holds in (9), $\mathbf{x}$ will still be early-eliminated according to Step 4 of the algorithm.
The analysis of the probability bound on the occurrence of (9) is different from what was done by Forney, since it compares the channel posterior probabilities given codeword portions of "unequal" lengths, while Forney dealt with the channel posterior probabilities for codewords of "equal" length. A refinement of Forney's derivation is therefore necessary.

For notational convenience, we replace $\ell_{\max}$ by $\beta$ in the following formulas. The probability $\xi(\ell, \beta)$ that (9) occurs is given by:



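\[ \xi(\ell, \beta) = \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \Phi_0\big(\mathbf{r}_{(\beta n-1)}\big) \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \mathbf{x}_{(\beta n-1)}\big), \tag{10} \]

where $\mathcal{R}$ is the (possibly discrete or continuous) generic alphabet of $\mathbf{r}$, and

\[ \Phi_0\big(\mathbf{r}_{(\beta n-1)}\big) \triangleq \begin{cases} 1, & \text{if (9) is valid;} \\ 0, & \text{otherwise.} \end{cases} \]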



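From

\[ \Phi_0\big(\mathbf{r}_{(\beta n-1)}\big) \le \left[ \frac{\sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)}}{\Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)}} \right]^{\rho} \quad \text{for } \rho \ge 0, \]

we obtain:

\[ \begin{aligned}
\xi(\ell, \beta) &\le \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \left[ \frac{\sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)}}{\Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)}} \right]^{\rho} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \mathbf{x}_{(\beta n-1)}\big) \\
&= \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \left[ \sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right]^{\rho} \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big),
\end{aligned} \]

where the last step uses $\Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \mathbf{x}_{(\beta n-1)}\big) = \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)$, a consequence of the memoryless property of the channel. Taking the expectation of $\xi(\ell, \beta)$ with respect to the random selection of labels (i.e., codewords) of length $(\beta n - 1)$ yields that: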



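\[ \begin{aligned}
E[\xi(\ell, \beta)] &\le E\left[ \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \left[ \sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right]^{\rho} \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \right] \\
&= \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} E\left[ \left[ \sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right]^{\rho} \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \right] \\
&\overset{(a)}{=} \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} E\left[ \left[ \sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right]^{\rho} \right] E\left[ \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \right] \\
&\overset{(b)}{\le} \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \left( E\left[ \sum_{\tilde{\mathbf{x}}_{(\beta n-1)} \in \hat{\mathcal{C}}_\beta} \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right] \right)^{\rho} E\left[ \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \right] \\
&= |\hat{\mathcal{C}}_\beta|^{\rho} \sum_{\mathbf{r}_{(\beta n-1)} \in \mathcal{R}^{\beta n}} \left( E\left[ \Pr\big(\mathbf{r}_{(\beta n-1)} \,\big|\, \tilde{\mathbf{x}}_{(\beta n-1)}\big)^{1/(1+\rho)} \right] \right)^{\rho} E\left[ \Pr\big(\mathbf{r}_{(\ell n-1)} \,\big|\, \mathbf{x}_{(\ell n-1)}\big)^{1/(1+\rho)} \Pr\big(\mathbf{r}_{(\ell n,\, \beta n-1)} \,\big|\, \mathbf{x}_{(\ell n,\, \beta n-1)}\big) \right],
\end{aligned} \]

where (a) holds since label $\mathbf{x}_{(\beta n-1)}$ and any labels in $\hat{\mathcal{C}}_\beta$ are selected independently, and (b) is valid due to Jensen's inequality with $\rho \le 1$. Finally, by using the terminologies in Section II-B and noting that the number of random labels is at most $|\hat{\mathcal{C}}_\beta| \le 2^{k\beta} = e^{n\beta R}$, we obtain: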



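\[ \begin{aligned}
E[\xi(\ell, \beta)] &\le e^{n\beta\rho R} \left[ \sum_{j=1}^{J} \left( \sum_{i=1}^{I} p_i P_{ji}^{1/(1+\rho)} \right)^{1+\rho} \right]^{\ell n} \left[ \sum_{j=1}^{J} \left( \sum_{i=1}^{I} p_i P_{ji}^{1/(1+\rho)} \right)^{\rho} \left( \sum_{i=1}^{I} p_i P_{ji} \right) \right]^{(\beta-\ell)n} \\
&= \exp\{-\ell n[-\rho R + E_0(\rho, \mathbf{p})]\} \exp\{-(\beta - \ell)n[-\rho R + E_1(\rho, \mathbf{p})]\},
\end{aligned} \tag{11} \]

where $E_0(\rho, \mathbf{p})$ is defined in (3), and

\[ E_1(\rho, \mathbf{p}) \triangleq -\log \sum_{j=1}^{J} \left( \sum_{i=1}^{I} p_i P_{ji}^{1/(1+\rho)} \right)^{\rho} \left( \sum_{i=1}^{I} p_i P_{ji} \right). \]

It is obvious that $E_1(\rho, \mathbf{p}) \ge E_0(\rho, \mathbf{p})$, with equality if, and only if, either $\rho = 0$ or there exists $j \in [1, J]$ such that $P_{ji} = 1$ for every $1 \le i \le I$; thus, $E_1(\rho) \triangleq E_1(\rho, \mathbf{p}^*) \ge E_0(\rho) = \max_{\mathbf{p}} E_0(\rho, \mathbf{p})$, where $\mathbf{p}^*$ is the input distribution that maximizes $E_0(\rho, \mathbf{p})$. An example of the functions $E_0(\rho)$ and $E_1(\rho)$ over the BSC with crossover probability 0.045 is given in Fig. 4.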
Inequality (11) provides an upper probability bound for a top path ending at level $\ell$ being early-eliminated due to $\ell_{\max} = \beta$. Based on (11), we can proceed to derive the bound for the probability $P_{e,E}$ that an incorrect codeword is claimed at the end of the sequential search because the correct path is early-eliminated during the decoding process.



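Without loss of generality, assume that the all-zero codeword is transmitted. Then,

\[ \begin{aligned}
P_{e,E} = P_{e,E}(\Delta) &\le \Pr\left( \bigcup_{\ell=1}^{L-\Delta} \big\{ \mathbf{0}_{(\ell n-1)} \text{ is early-eliminated} \big\} \right) \\
&\le \sum_{\ell=1}^{L-\Delta} \Pr\big( \mathbf{0}_{(\ell n-1)} \text{ is early-eliminated} \big) \\
&\le \sum_{\ell=1}^{L-\Delta} \exp\{-\ell n[-\rho R + E_0(\rho)]\} \exp\{-\Delta n[-\rho R + E_1(\rho)]\},
\end{aligned} \]

where the last inequality follows from (11) by taking $\mathbf{p} = \mathbf{p}^*$, and the observation that $\beta - \ell \ge \Delta$. Denoting $\lambda = E_0(\rho) - \rho R$, we continue the derivation from the above bound:

\[ \begin{aligned}
P_{e,E} &\le \exp\{-\Delta n[-\rho R + E_1(\rho)]\} \sum_{\ell=1}^{L-\Delta} \exp\{-\ell n \lambda\} \\
&\le \exp\{-\Delta n[-\rho R + E_1(\rho)]\} \sum_{\ell=1}^{\infty} \exp\{-\ell n \lambda\} \\
&= K_n \exp\{-\Delta n[-\rho R + E_1(\rho)]\},
\end{aligned} \tag{12} \]

where $K_n = e^{-n\lambda}/(1 - e^{-n\lambda})$ is a constant, independent of $\Delta$. Consequently,

\[ \liminf_{n \to \infty} -\frac{1}{n} \log P_{e,E} \ge \Delta[-\rho R + E_1(\rho)] + \lambda \ge \Delta[-\rho R + E_1(\rho)], \]

subject to $E_0(\rho) > \rho R$ with $0 \le \rho \le 1$, which immediately implies:

\[ \liminf_{n \to \infty} -\frac{1}{n} \log P_{e,E} \ge \Delta \cdot E_{el}(R), \]

where $E_{el}(R) \triangleq \max_{\{\rho \in [0,1] \,:\, E_0(\rho) > \rho R\}} [-\rho R + E_1(\rho)]$. Following a similar argument to Forney's, we conclude that the additional error due to early elimination in the MLSDA becomes exponentially negligible if

\[ \Delta \cdot E_{el}(R) > (m+1)E_c(R) \quad \text{or equivalently} \quad \frac{\Delta}{m+1} > \frac{E_c(R)}{E_{el}(R)} \tag{13} \]

for code rates above the channel cutoff rate.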
V. Numerical and Simulation on Early Elimination Windows for Binary Symmetric Channels

For the binary symmetric channel (BSC) with crossover probability $\varepsilon$,

\[ E_0(\rho) = \rho\log(2) - (1+\rho)\log\left[ \varepsilon^{1/(1+\rho)} + (1-\varepsilon)^{1/(1+\rho)} \right], \]

and

\[ E_1(\rho) = \rho\log(2) - \rho\log\left[ \varepsilon^{1/(1+\rho)} + (1-\varepsilon)^{1/(1+\rho)} \right]. \]

By choosing $\varepsilon = 0.045$ and $\varepsilon = 0.095$ to respectively approach the cutoff rates 1/2 and 1/3, it can be derived from (13) that the suggested early-elimination windows are:

\[ \Delta > \frac{0.500}{0.250} \times (m+1) \approx 2.00(m+1) \quad \text{for rate 1/2 codes} \tag{14} \]

and

\[ \Delta > \frac{0.334}{0.333} \times (m+1) \approx 1.00(m+1) \quad \text{for rate 1/3 codes.} \tag{15} \]

The exponent functions $E_{el}(R)$ and $E_c(R)$ for the above BSCs are plotted in Figs. 5 and 6, respectively. Conditions (14) and (15) indicate that for the (2,1,6) and (3,1,8) convolutional codes, respectively taking $\Delta = 15$ and $\Delta = 10$ should suffice to result in negligible performance degradation at the cutoff rates. Simulations are then performed and summarized in Figs. 7 and 8 to examine the analytical results.
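The ratios of about 2.00 and 1.00 can be checked numerically with a short Python sketch. It assumes the BSC expressions $E_0(\rho) = \rho\log 2 - (1+\rho)\log S(\rho)$ and $E_1(\rho) = \rho\log 2 - \rho\log S(\rho)$ with $S(\rho) = \varepsilon^{1/(1+\rho)} + (1-\varepsilon)^{1/(1+\rho)}$, evaluates everything at the cutoff rate $R_0 = E_0(1)$ where $E_c(R_0) = R_0$, and replaces the maximization over $\rho$ by a grid search that includes the boundary $E_0(\rho) = \rho R$ as a limiting case. The function names and grid size are illustrative assumptions.

```python
import math


def _s(rho, eps):
    a = 1.0 / (1.0 + rho)
    return eps ** a + (1.0 - eps) ** a


def e0(rho, eps):
    """E_0(rho) of the BSC with crossover eps, in nats."""
    return rho * math.log(2.0) - (1.0 + rho) * math.log(_s(rho, eps))


def e1(rho, eps):
    """E_1(rho) of the BSC with crossover eps, in nats."""
    return rho * math.log(2.0) - rho * math.log(_s(rho, eps))


def window_ratio_at_cutoff(eps, steps=2000):
    """Approximate E_c(R_0) / E_el(R_0), the lower limit on Delta / (m + 1)."""
    r0 = e0(1.0, eps)                            # cutoff rate; E_c(R_0) = R_0
    best = 0.0
    for i in range(steps + 1):
        rho = i / steps
        if e0(rho, eps) >= rho * r0 - 1e-12:     # feasible set, boundary included
            best = max(best, e1(rho, eps) - rho * r0)
    return r0 / best


ratio_half = window_ratio_at_cutoff(0.045)       # BSC matched to rate 1/2
ratio_third = window_ratio_at_cutoff(0.095)      # BSC matched to rate 1/3
```

Multiplying `ratio_half` by the constraint length 7 of the (2,1,6) code, or `ratio_third` by the constraint length 9 of the (3,1,8) code, gives the level thresholds that the choices $\Delta = 15$ and $\Delta = 10$ are meant to exceed.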

It can be observed from Fig. 7 that the MLSDA with early-elimination window $\Delta = 15$ does exhibit negligible performance degradation for all $E_b/N_0$ values simulated, where we take $\varepsilon = \frac{1}{2}\,\mathrm{erfc}\big(\sqrt{E_b/N_0}\big)$ as a convention, and $\mathrm{erfc}(\cdot)$ is the complementary error function. This result coincides with our theoretical analysis in (14).
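For reference, the stated convention $\varepsilon = \frac{1}{2}\,\mathrm{erfc}\big(\sqrt{E_b/N_0}\big)$ maps a signal-to-noise ratio in dB to a BSC crossover probability as in the sketch below. The function name is a hypothetical helper, and the conversion from dB to a linear ratio is the usual $10^{x/10}$.

```python
import math


def bsc_crossover(ebn0_db):
    """Crossover probability 0.5 * erfc(sqrt(Eb/N0)) for Eb/N0 given in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)              # dB to linear ratio
    return 0.5 * math.erfc(math.sqrt(ebn0))


# Crossover probabilities over an illustrative range of Eb/N0 values in dB.
table = {snr_db: bsc_crossover(snr_db) for snr_db in range(0, 7)}
```

The crossover probability falls quickly with increasing $E_b/N_0$, which is why the higher simulated signal-to-noise ratios correspond to channels much less noisy than the ones used to derive (14) and (15).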

From Fig. 8, we observe that an early-elimination window $\Delta = 12$ is required in order to maintain the maximum-likelihood performance for all signal-to-noise ratios simulated, and the analytical $\Delta = 10$ can only provide negligible performance degradation for $E_b/N_0 < 3$ dB. For example, the bit error rates (BERs) for $\Delta = 10$ and $\Delta = 20$ are $1.40 \times 10^{-3}$ and $1.04 \times 10^{-3}$, respectively, at $E_b/N_0 = 4$ dB. This hints that when channels become less noisy, the contribution of the errors due to early elimination is a little more dominant to the overall performance than that of the maximum-likelihood decision errors. Hence, a slightly larger early-elimination window than the analytical one $\Delta = 10$ is necessary to bring down the dominance of the early-elimination errors.

The above "under-estimation" of the early elimination window is quite different from Forney's 
estimation of the sufficient path truncation window, for which an over-estimation is usually 
resulted. The under-estimation of early elimination windows, as well as the over-estimation 
of path truncation windows, can be reasoned from the multiplicative constants in the error 
probability bounds. In Forney's argument, it requires: 

\[
\exp\{-n r E_r(R)\} < \frac{q-1}{1-q^{-\lambda/R}} \exp\{-n(m+1)E_0(\rho)\}, \tag{16}
\]

where q = 2 for the BSCs. In our derivation, we demand: 

\[
\frac{e^{-n\lambda}}{1-e^{-n\lambda}} \exp\{-nA E_{el}(R)\} < \frac{1}{1-2^{-\lambda/R}} \exp\{-n(m+1)E_0(\rho)\},
\]

or equivalently,

\[
\exp\{-nA E_{el}(R)\} < 2^{\lambda/R} \exp\{-n(m+1)E_0(\rho)\}, \tag{17}
\]

since $R = (1/n)\log(2)$ nat/symbol for (n,1,m) binary convolutional codes. By taking the
optimal $\rho$ that yields the best exponent, i.e., $\lambda = E_0(\rho) - \rho R = 0$, the
multiplicative constant $1/(1 - 2^{-\lambda/R})$ in (16) approaches infinity (see footnote 4),
while the multiplicative constant $2^{\lambda/R}$ in (17) reduces
to 1. This explains why our estimate of early elimination windows tends to be exact
(e.g., for the simulated (2,1,6) convolutional code) or under-estimated (e.g., for the simulated
(3,1,8) convolutional code), while Forney's estimate of path truncation windows is often larger
than necessary.
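
The opposite behaviour of the two multiplicative constants can be seen numerically. The sketch below reads the constants as $(q-1)/(1-q^{-\lambda/R})$ and $2^{\lambda/R}$, and uses $n = 2$ purely as an example; the function names and the sample values of $\lambda$ are assumptions made for illustration.

```python
import math


def forney_constant(lam: float, rate: float, q: int = 2) -> float:
    """(q - 1) / (1 - q**(-lam/rate)): grows without bound as lam -> 0+."""
    return (q - 1) / (1.0 - q ** (-lam / rate))


def early_elimination_constant(lam: float, rate: float) -> float:
    """2**(lam/rate): tends to 1 as lam -> 0+."""
    return 2.0 ** (lam / rate)


n = 2                      # example: rate one-half code
rate = math.log(2.0) / n   # R = (1/n) log 2 nat/symbol for an (n,1,m) code

for lam in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"lambda={lam:g}: "
          f"Forney constant={forney_constant(lam, rate):.1f}, "
          f"early-elimination constant={early_elimination_constant(lam, rate):.4f}")
```

As $\lambda$ shrinks, the first constant inflates the right-hand side of the truncation condition, which loosens the bound and pushes the resulting window estimate upward, whereas the second constant leaves the early-elimination condition essentially unchanged.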

4 Zigangirov [20] has provided a tighter decoding error bound for $P_{e,c}$ by simultaneously minimizing the multiplicative
constant, $(q-1)/(1-q^{-\lambda/R})$, and the exponential term, $\exp\{-n(m+1)E_0(\rho)\}$. Zigangirov's probability bound, however,
gives the same asymptotic error exponent as (15), even though it is better for finite n(m + 1).

The simulated reduction of decoding complexity due to early elimination is plotted in Fig. 9.
The decoding complexity is measured by the average number of branch metric computations
per information bit. From Fig. 9, we find that the computational complexity is significantly
reduced at medium signal-to-noise ratios. For example, at $E_b/N_0 = 6$ dB, the MLSDA with
early elimination window A = 15 requires only 5.73 metric computations per information bit.
This is around one eleventh of 62.7, the average computational complexity of
the MLSDA without early elimination.
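
The figures quoted above can be put next to the complexity bounds stated in the caption of Fig. 9. The short sketch below only repeats that arithmetic for m = 6 and L = 200; the variable names are illustrative.

```python
m, L = 6, 200  # memory order and message length of the (2,1,6) simulation

# Bounds from the caption of Fig. 9 (branch metric computations per information bit).
upper_bound = 2 * (2**m * L - (m - 2) * 2**m - 2) / L
lower_bound = (2 * L + m) / L

with_ee = 5.73     # MLSDA with early elimination window 15 at 6 dB (from the text)
without_ee = 62.7  # MLSDA without early elimination at 6 dB (from the text)

print(f"upper bound = {upper_bound:.2f}, lower bound = {lower_bound:.2f}")
print(f"reduction factor = {without_ee / with_ee:.2f}")
```

At 6 dB, the MLSDA without early elimination thus spends about half of the upper bound, while the version with early elimination stays within a factor of three of the lower bound.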

Figure 10 depicts the "99.9% Open Stack size," defined as the Open Stack size needed to
complete 99.9% of the simulation runs without stack overflow. The 99.9% Open Stack size is
equivalent to the Open Stack size required for the stack overflow probability to be less than
$10^{-3}$. It should be mentioned beforehand that the Closed Stack can be implemented simply by
adopting a bitmap table that uses one bit per node to record whether the node has ever been
the end node of the top path of the Open Stack. As a result, the Closed Stack consumes much
less memory than the Open Stack, and hence only the size of the Open Stack needs to be
simulated.
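
A bitmap of this kind can be sketched in a few lines. The class below is a minimal illustration, not the authors' implementation: the class and method names, the rectangular (level, state) indexing, and the assumption that the trellis has L + m + 1 levels are all hypothetical choices made for the example.

```python
class ClosedStackBitmap:
    """One bit per trellis node, set once the node has ended the top path."""

    def __init__(self, num_levels: int, num_states: int) -> None:
        self.num_states = num_states
        # Round the number of bits up to a whole number of bytes.
        self.bits = bytearray((num_levels * num_states + 7) // 8)

    def _index(self, level: int, state: int) -> int:
        return level * self.num_states + state

    def mark(self, level: int, state: int) -> None:
        i = self._index(level, state)
        self.bits[i >> 3] |= 1 << (i & 7)

    def contains(self, level: int, state: int) -> bool:
        i = self._index(level, state)
        return bool(self.bits[i >> 3] & (1 << (i & 7)))


# Hypothetical sizing for a (2,1,6) code with L = 200: 2**6 states per level and
# L + m + 1 = 207 levels (assumes m zero-tail branches terminate the trellis).
closed = ClosedStackBitmap(num_levels=207, num_states=64)
closed.mark(level=3, state=5)
print(closed.contains(3, 5), closed.contains(3, 6), len(closed.bits), "bytes")
```

Under these assumptions the whole table occupies 1656 bytes, which illustrates why the Closed Stack is a minor contributor to the memory budget.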

Several observations can be made from Fig. 10. Firstly, the Open Stack size for the MLSDA
without early elimination may grow large at medium SNRs. As an example, the MLSDA without
early elimination requires an Open Stack size of 2877 in order to satisfy 99.9% of the $10^5$
simulation runs at $E_b/N_0 = 6$ dB. Secondly, unlike conventional sequential decoding that
operates on a code tree, the memory requirement of the Open Stack for the trellis-based MLSDA
without early elimination also decreases at low SNRs. This is because, although the decoding
process visits almost all trellis nodes at low SNRs, most of the visited paths are directly
eliminated by path merging (cf. Step 3 of the trellis-based MLSDA). Thirdly, the early
elimination modification flattens the curve of the stack size that suffices for 99.9% of the
simulation runs, so the 99.9% Open Stack size becomes much less sensitive to the SNR. In
particular, only about one ninth of the 99.9% Open Stack size, specifically 318, is required at
$E_b/N_0 = 6$ dB when early elimination is applied to the MLSDA.

A final observation on Fig. 10 is that the 99.9% Open Stack size of the MLSDA without early
elimination converges to its smallest possible value (L + 1) = 201 when $E_b/N_0$ is beyond 11.5
dB. It can therefore be expected that the sequential search of the MLSDA goes
straight to the terminal node of the code trellis at $E_b/N_0 = 11.5$ dB. The early elimination
modification, however, reduces the SNR at which the 99.9% Open Stack size converges down to
9.5 dB by eliminating all the paths that deviate from the path corresponding to the final decoding
decision.
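
The early elimination test itself amounts to a single comparison when a path reaches the top of the Open Stack. The following is a minimal sketch under stated assumptions: each path records the trellis level of its end node, and a path is discarded once that level lags the farthest expanded level by at least the window. Whether the comparison is strict or not, as well as all names and sample values, are assumptions made here for illustration.

```python
def should_eliminate(end_level: int, farthest_level: int, window: int) -> bool:
    """True if the top path's end node lags the farthest expanded node by >= window levels."""
    return farthest_level - end_level >= window


farthest = 40  # deepest trellis level expanded so far (illustrative value)
for end_level in (39, 30, 25, 10):
    print(end_level, should_eliminate(end_level, farthest, window=15))
```

With a window of 15, the paths ending at levels 39 and 30 are kept, while those ending at levels 25 and 10 are dropped without being extended.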

Finally, we should emphasize that the 99.9% Open Stack size is different from the minimum
stack size required to introduce negligible performance degradation. The traditional way to handle
stack overflow under a finite stack size is to eliminate the paths at the bottom of the stack in order
to make room for the newly arrived paths [13]. For example, by denoting OPENMAX as the
boundary of the stack size, beyond which a newly added path will push out the path with
the largest metric when the stack is full [8], Fig. 11 shows that OPENMAX = 2560 is sufficient
to maintain near-ML performance for the MLSDA without early elimination. This number is
smaller than 2877, i.e., the 99.9% Open Stack size at $E_b/N_0 = 6$ dB. The introduction of
early elimination can again lower OPENMAX down to 256.
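
The overflow rule can be sketched with a sorted list. This is an illustrative sketch rather than the implementation of [8]: it assumes that a smaller metric is better (so the top path has the smallest metric), and the class name, method names, and sample metrics are hypothetical.

```python
import bisect


class BoundedOpenStack:
    """Open Stack kept sorted by ascending metric; overflow drops the largest metric."""

    def __init__(self, openmax: int) -> None:
        self.openmax = openmax
        self.paths = []  # list of (metric, path_id), smallest metric first

    def push(self, metric: float, path_id: int) -> None:
        bisect.insort(self.paths, (metric, path_id))
        if len(self.paths) > self.openmax:
            self.paths.pop()  # discard the path with the largest metric

    def pop_top(self):
        return self.paths.pop(0)  # path with the smallest metric


stack = BoundedOpenStack(openmax=3)
for path_id, metric in enumerate([4.0, 1.0, 7.0, 2.0, 9.0]):
    stack.push(metric, path_id)
print(stack.paths)
```

In this sketch, a new path whose metric exceeds every stored metric is itself the one discarded when the stack is full. A production decoder would likely use a heap or a balanced structure instead of a Python list, since `pop(0)` is linear in the stack size.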

VI. Concluding Remarks and Future Work 

In this work, we propose to reduce the computational complexity and memory requirement of
the maximum-likelihood sequential-search decoding algorithm by early elimination. The random
coding analysis of the early elimination window sufficient for negligible performance degradation,
as well as the subsequent simulations, confirms the anticipated improvement. Since the MLSDA
with early elimination suits applications that demand a near-ML software decoder
with limited computational power and memory, a future work of practical interest
is to apply the MLSDA with early elimination to the "super-code" for joint multi-path
channel equalization and convolutional decoding [10].

Fig. 1. Single-input n-output encoder model considered in [17]. All elements are in GF(q), where q is either a
prime or a power of a prime.

[Figure 2 annotations: Early Elimination Threshold A (horizontal span); legend: Closed Stack, Open Stack]

Fig. 2. Early elimination window A in the trellis-based MLSDA.




Fig. 3. Exponent lower bound $E_r(R)$ of the additional error due to path truncation and exponent $E_c(R)$ of the
maximum-likelihood decoding error for time-varying convolutional codes (without path truncation) under the BSC
with crossover probability 0.4.

[Figure 4 plot title: Binary Symmetric Channel with Crossover Probability 0.045]

Fig. 4. Functions $E_0(\rho)$ and $E_1(\rho)$ under the BSC with crossover probability 0.045.



[Figure 5 plot title: Cross Over Probability 0.045, Capacity = 0.735, Cutoff Rate 0.500 bit/symbol]

Fig. 5. Exponent lower bound $E_{el}(R)$ of the additional error due to early elimination and exponent $E_c(R)$ of
the maximum-likelihood decoding error for time-varying convolutional codes (without early elimination) under the
BSC with crossover probability 0.045.

[Figure 6 plot title: Cross Over Probability 0.095, Capacity = 0.547, Cutoff Rate 0.334 bit/symbol; curve label: $E_c(R)$; horizontal axis: Rate (in bit/symbol)]

Fig. 6. Exponent lower bound $E_{el}(R)$ of the additional error due to early elimination and exponent $E_c(R)$ of
the maximum-likelihood decoding error for time-varying convolutional codes (without early elimination) under the
BSC with crossover probability 0.095.




Fig. 7. Performance for (2,1,6) convolutional codes under different early elimination windows. The generator
polynomials of the (2,1,6) convolutional code are [554 774] in octal. The message length is infinite, and the backsearch
limit is set to be r = 40.

Fig. 8. Performance for (3,1,8) convolutional codes under different early elimination windows. The generator
polynomials of the (3,1,8) convolutional code are [557 663 711] in octal. The message length is infinite, and the
backsearch limit is set to be r = 52.




Fig. 9. Average branch metric computations per information bit for (2,1,6) convolutional codes under different early
elimination windows. The message length is L = 200, and no backsearch limit is utilized. Notably, the decoding
complexity is upper-bounded by $2[2^m L - (m-2)2^m - 2]/L \approx 125.4$, and is lower-bounded by $(2L+m)/L \approx 2$.

Fig. 10. The 99.9% Open Stack size of the MLSDA with and without early elimination for (2,1,6) convolutional
codes. The message length L is 200, and no backsearch limit is utilized.

Fig. 11. Performances of the MLSDA with and without early elimination subject to the constraint of finite stack
size for (2,1,6) convolutional codes. OPENMAX is the upper boundary of the Open Stack size. The message length
is L = 200, and no backsearch limit is utilized.